Abstract

Product Distribution (PD) theory is a new framework
for doing distributed, adaptive control of a multiagent
system (MAS). We introduce the technique of “coordinate
transformations” in PD theory gradient descent. These
transformations selectively couple a few agents with each
other into “meta-agents”. Intuitively, this can be viewed as
a generalization of forming binding contracts between those
agents. Doing this sacrifices a bit of the distributed nature
of the MAS, in that there must now be communication from
multiple agents in determining what joint move is finally
implemented. However, as we demonstrate in computer
experiments, these transformations improve the performance
of the MAS.


1. Introduction 

Product Distribution (PD) theory is a recently intro- 
duced broad framework for analyzing, controlling, and 
optimizing distributed systems [8, 9, 10]. Among its po- 
tential applications are adaptive, distributed control of 
a Multi-Agent System (MAS), (constrained) optimiza- 
tion, sampling of high-dimensional probability densities 
(i.e., improvements to Metropolis sampling), density es- 
timation, numerical integration, reinforcement learning, 
information-theoretic bounded rational game theory, popu- 
lation biology, and management theory. Some of these are 
investigated in [1, 2, 7]. 

Here we investigate PD theory’s use for adaptive, dis- 
tributed control of a MAS. Typically such control is done 
by having each agent run its own reinforcement learning
algorithm [3, 11, 12, 13]. In this approach the utility
function of each agent is based on the world utility G(x)
mapping the joint move of the agents, $x \in X$, to the
performance of the overall system. However, in practice the
agents in a MAS are bounded rational. Moreover, the equilibrium
they reach will typically involve mixed strategies rather than 
pure strategies, i.e., they don’t settle on a single point x op- 
timizing G(x). This suggests formulating an approach that 
explicitly accounts for the bounded rational, mixed strategy 
character of the agents. 

Now in any game, bounded rational or otherwise, the
agents are independent, with each agent i choosing its move
$x_i$ at any instant by sampling its probability distribution
(mixed strategy) at that instant, $q_i(x_i)$. Accordingly, the
distribution of the joint moves is a product distribution,
$P(x) = \prod_i q_i(x_i)$. In this representation of a MAS, all
coupling between the agents occurs indirectly; it is the separate
distributions of the agents $\{q_i\}$ that are statistically coupled,
while the actual moves of the agents are independent.

PD theory adopts this perspective to show that the
equilibrium of a MAS is the minimizer of a Lagrangian
$\mathcal{L}(P)$, derived using information theory, that
quantifies the expected value of G for the joint distribution
P(x). From this perspective, the update rules used by the
agents in RL-based systems for controlling MASs are just
particular (inefficient) ways of finding that minimizing
distribution. PD theory suggests novel ways to find the
equilibrium, e.g., applying any of the powerful search
techniques for continuous variables, like gradient descent, to
find the P optimizing $\mathcal{L}$. By casting the problem this
way in terms of finding an optimal P rather than finding an
optimal x, we can exploit the power of search techniques for
continuous variables even when X is a discrete, finite space.

One disadvantage of using a technique such as gradient descent
is the possibility of being trapped in a local minimum. To
escape from a local minimum, we explore in this paper the
possibility of performing a change of semi-coordinates (“semi”
because the transformation need not be invertible). To start
this study, we experiment with local changes between two
agents and study how they can produce an improvement. In
the next section we review the game-theoretic motivation of
PD theory. Then we present the concept of semi-coordinate
transformation and present results showing that it can improve
the results significantly.

2. Bounded Rational Game Theory 

In this section we motivate PD theory as the
information-theoretic formulation of bounded rational game theory.

2.1. Review of noncooperative game theory 

In noncooperative game theory one has a set of N players.
Each player i has its own set of allowed pure strategies.
A mixed strategy is a distribution $q_i(x_i)$ over player
i’s possible pure strategies. Each player i also has a private
utility function $g_i$ that maps the pure strategies adopted by
all N of the players into the real numbers. So given mixed
strategies of all the players, the expected utility of player i
is

$$E(g_i) = \int dx \prod_j q_j(x_j)\, g_i(x).$$

In a Nash equilibrium every player adopts the mixed
strategy that maximizes its expected utility, given the mixed
strategies of the other players. More formally,
$\forall i,\ q_i = \mathrm{argmax}_{q'_i} \int dx\, q'_i(x_i) \prod_{j \neq i} q_j(x_j)\, g_i(x)$.
Perhaps the major objection that has been raised to the Nash
equilibrium concept is its assumption of full rationality [4, 5].
This is the assumption that every player i can both calculate
what the strategies $q_{j \neq i}$ will be and then calculate its
associated optimal distribution. In other words, it is the
assumption that every player will calculate the entire joint
distribution $q(x) = \prod_j q_j(x_j)$. If for no other reason than
the computational limitations of real humans, this assumption is
essentially untenable.

2.2. Review of the maximum entropy principle 

Shannon was the first person to realize that based on any
of several separate sets of very simple desiderata, there is a
unique real-valued quantification of the amount of syntactic
information in a distribution P(y). He showed that this
amount of information is (the negative of) the Shannon
entropy of that distribution, $S(P) = -\int dy\, P(y) \ln P(y)$.
So for example, the distribution with minimal information
is the one that doesn’t distinguish at all between the various
y, i.e., the uniform distribution. Conversely, the most
informative distribution is the one that specifies a single
possible y. Note that for a product distribution, entropy is
additive, i.e., $S(\prod_i q_i(y_i)) = \sum_i S(q_i)$.
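The additivity of entropy for product distributions can be checked numerically. The following is a minimal sketch, assuming discrete distributions (so the integral becomes a sum); the two marginals `q1` and `q2` are arbitrary illustrative values, not taken from the paper.

```python
import numpy as np

def entropy(p):
    """Shannon entropy -sum p ln p of a discrete distribution (0 ln 0 taken as 0)."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return float(-np.sum(p[nz] * np.log(p[nz])))

# Two hypothetical marginals and their product distribution.
q1 = np.array([2 / 3, 1 / 3])
q2 = np.array([0.5, 0.25, 0.25])
joint = np.outer(q1, q2)  # P(y1, y2) = q1(y1) * q2(y2)

s_joint = entropy(joint.ravel())
s_sum = entropy(q1) + entropy(q2)
print(s_joint, s_sum)  # the two values agree up to rounding
```

The same helper also illustrates the two extremes mentioned above: a distribution concentrated on a single y has zero entropy, and the uniform distribution over two outcomes has entropy ln 2.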

Say we are given some incomplete prior knowledge about
a distribution P(y). How should one estimate P(y) based
on that prior knowledge? Shannon’s result tells us how to
do that in the most conservative way: have your estimate of
P(y) contain the minimal amount of extra information
beyond that already contained in the prior knowledge about
P(y). Intuitively, this can be viewed as a version of
Occam’s razor. This approach is called the maximum entropy
(maxent) principle. It has proven useful in domains
ranging from signal processing to supervised learning [6].

2.3. Maxent Lagrangians 

Much of the work on equilibrium concepts in game the- 
ory adopts the perspective of an external observer of a game. 
We are told something concerning the game, e.g., its utility 
functions, information sets, etc., and from that wish to
predict what joint strategy will be followed by real-world
players of the game. Say that in addition to such information,
we are told the expected utilities of the players. What is our 
best estimate of the distribution q that generated those ex- 
pected utility values? By the maxent principle, it is the dis- 
tribution with maximal entropy, subject to those expectation 
values. 

To formalize this, for simplicity assume a finite number
of players and of possible strategies for each player.
To agree with the convention in other fields, from now on
we implicitly flip the sign of each $g_i$ so that the associated
player i wants to minimize that function rather than maximize
it. Intuitively, this flipped $g_i(x)$ is the “cost” to player
i when the joint strategy is x, though we will still use the
term “utility”.

Then for prior knowledge that the expected utilities of
the players are given by the set of values $\{\epsilon_i\}$, the
maxent estimate of the associated q is given by the minimizer of
the Lagrangian

$$\mathcal{L}(q) \equiv \sum_i \beta_i \left[ E_q(g_i) - \epsilon_i \right] - S(q)
 = \sum_i \beta_i \left[ \int dx \prod_j q_j(x_j)\, g_i(x) - \epsilon_i \right] - S(q)$$

where the subscript on the expectation value indicates that
it is evaluated under distribution q, and the $\{\beta_i\}$ are
“inverse temperatures” implicitly set by the constraints on the
expected utilities.

Solving, we find that the mixed strategies minimizing the
Lagrangian are related to each other via

$$q_i(x_i) \propto e^{-E_{q_{(i)}}(G \mid x_i)} \qquad (1)$$

where the overall proportionality constant for each i is set
by normalization, and $G \equiv \sum_i \beta_i g_i$ (see footnote 2).
In Eq. 1 the probability of player i choosing pure strategy $x_i$
depends on the effect of that choice on the utilities of the other
players. This reflects the fact that our prior knowledge concerns
all the players equally.

2 The subscript $q_{(i)}$ on the expectation value indicates that it
is evaluated according to the distribution $\prod_{j \neq i} q_j$.

If we wish to focus only on the behavior of player i, it is
appropriate to modify our prior knowledge. To see how to
do this, first consider the case of maximal prior knowledge,
in which we know the actual joint strategy of the players,
and therefore all of their expected costs. For this case,
trivially, the maxent principle says we should “estimate” q as
that joint strategy (it being the q with maximal entropy that
is consistent with our prior knowledge). The same conclusion
holds if our prior knowledge also includes the expected
cost of player i.

Modify this maximal set of prior knowledge by removing
from it specification of player i’s strategy. So our prior
knowledge is the mixed strategies of all players other than
i, together with player i’s expected cost. We can incorporate
prior knowledge of the other players’ mixed strategies
directly, without introducing Lagrange parameters. The
resultant maxent Lagrangian is

$$\mathcal{L}_i(q_i) \equiv \beta_i \left[ E(g_i) - \epsilon_i \right] - S_i(q_i)
 = \beta_i \left[ \int dx \prod_j q_j(x_j)\, g_i(x) - \epsilon_i \right] - S_i(q_i),$$

which is solved by a set of coupled Boltzmann distributions:

$$q_i(x_i) \propto e^{-\beta_i E_{q_{(i)}}(g_i \mid x_i)}. \qquad (2)$$

Following Nash, we can use Brouwer’s fixed point theorem
to establish that for any non-negative values $\{\beta_i\}$, there must
exist at least one product distribution given by the product
of these Boltzmann distributions (one term in the product
for each i).

The first term in $\mathcal{L}_i$ is minimized by a perfectly
rational player. The second term is minimized by a perfectly
irrational player, i.e., by a perfectly uniform mixed strategy
$q_i$. So $\beta_i$ in the maxent Lagrangian explicitly specifies
the balance between the rational and irrational behavior of
the player. In particular, for $\beta_i \to \infty$, by minimizing the
Lagrangians we recover the Nash equilibria of the game. More
formally, in that limit the set of q that simultaneously
minimize the Lagrangians is the same as the set of delta
functions about the Nash equilibria of the game. The same is
true for Eq. 1.

Eq. 1 is just a special case of Eq. 2, where all players
share the same private utility, G. (Such games are known
as team games.) This relationship reflects the fact that for
this case, the difference between the maxent Lagrangian and
the one in Eq. 1 is independent of $q_i$. Due to this
relationship, our guarantee of the existence of a solution to the
set of maxent Lagrangians implies the existence of a solution
of the form Eq. 1. Typically players will be closer to
minimizing their expected cost than maximizing it. For prior
knowledge consistent with such a case, the $\beta_i$ are all
non-negative.

For each player i define

$$f_i(x, q_i(x_i)) \equiv \beta_i g_i(x) + \ln[q_i(x_i)]. \qquad (3)$$

Then, up to an additive constant, the maxent Lagrangian for
player i is

$$\mathcal{L}_i(q) = \int dx\, q(x)\, f_i(x, q_i(x_i)). \qquad (4)$$

Now in a bounded rational game every player sets its
strategy to minimize its Lagrangian, given the strategies of the
other players. In light of Eq. 4, this means that we interpret
each player in a bounded rational game as being perfectly
rational for a utility that incorporates its computational
cost. To do so we simply need to expand the domain
of “cost functions” to include probability values as well as
joint moves.

Often our prior knowledge will not consist of exact
specification of the expected costs of the players, even if that
knowledge arises from watching the players make their
moves. Such alternative kinds of prior knowledge are
addressed in [9, 10]. Those references also demonstrate the
extension of the formulation to allow multiple utility
functions of the players, and even variable numbers of players.
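To make the coupled Boltzmann distributions of Eq. 2 concrete, here is a minimal sketch for a two-player team game. It is an illustration under stated assumptions, not the method used in the paper: the cost table `G` is hypothetical, both players share the same inverse temperature `beta`, and the fixed point is sought by naively alternating the two Boltzmann updates, which happens to converge for this small example but is not guaranteed to converge in general.

```python
import numpy as np

def boltzmann(cond_cost, beta):
    """q_i(x_i) proportional to exp(-beta * E(G | x_i)), normalized."""
    w = np.exp(-beta * (cond_cost - cond_cost.min()))  # shift for numerical stability
    return w / w.sum()

def solve_team_game(G, beta, sweeps=200):
    """Alternate the Boltzmann form of Eq. 2 between the two players."""
    G = np.asarray(G, dtype=float)
    q1 = np.full(G.shape[0], 1.0 / G.shape[0])
    q2 = np.full(G.shape[1], 1.0 / G.shape[1])
    for _ in range(sweeps):
        q1 = boltzmann(G @ q2, beta)    # E(G | x_1), averaged over q_2
        q2 = boltzmann(G.T @ q1, beta)  # E(G | x_2), averaged over q_1
    return q1, q2

G = [[0.0, 1.0], [1.0, 0.5]]  # hypothetical shared cost table (lower is better)
q1, q2 = solve_team_game(G, beta=2.0)
print(q1, q2)
```

At the returned point each player's mixed strategy is the Boltzmann distribution over its conditional expected cost given the other player's strategy, which is the self-consistency condition expressed by Eq. 2. Raising `beta` pushes both strategies toward the pure joint move with the lowest cost.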


3. Optimizing the Lagrangian and Algorithm 

First we introduce the shorthand

$$[G \mid x_i] \equiv E(G \mid x_i) = \int dx'\, \delta(x'_i - x_i)\, G(x') \prod_{j \neq i} q_j(x'_j),$$

where the delta function forces $x'_i = x_i$ in the usual way.
Now given any initial q, one may use gradient descent to
search for the q optimizing $\mathcal{L}(q)$. Taking the appropriate
partial derivatives, the descent direction is given by

$$\Delta q_i(x_i) = \frac{\delta \mathcal{L}}{\delta q_i(x_i)} = [G \mid x_i] + \beta^{-1} \log q_i(x_i) + C \qquad (5)$$

where C is a constant set to preserve the norm of the
probability distribution after update, i.e., set to ensure that

$$\int dx_i\, q_i(x_i) = \int dx_i\, \left( q_i(x_i) + \Delta q_i(x_i) \right) = 1. \qquad (6)$$

Evaluating, we find that

$$C = -\frac{1}{\int dx_i\, 1} \int dx_i \left\{ [G \mid x_i] + \beta^{-1} \log q_i(x_i) \right\}. \qquad (7)$$

(Note that for finite X, those integrals are just sums.)
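For a finite action set, the update of Eqs. 5 to 7 reduces to a few array operations. The sketch below is illustrative only: the values of `cond_G` (standing for [G | x_i]), `beta` and `step` are hypothetical, and it assumes that the distribution is updated by subtracting a small multiple of the quantity in Eq. 5, with the step small enough that all probabilities stay positive.

```python
import numpy as np

def descent_direction(cond_G, q_i, beta):
    """Eq. 5 for one agent, with C chosen as in Eq. 7 so the components sum to zero."""
    raw = cond_G + np.log(q_i) / beta
    C = -raw.mean()  # -(1/|X_i|) * sum over x_i of the bracketed term
    return raw + C

def update(q_i, cond_G, beta, step):
    """Move against the gradient; the step is assumed small enough to keep q_i > 0."""
    return q_i - step * descent_direction(cond_G, q_i, beta)

q = np.array([0.5, 0.3, 0.2])
cond_G = np.array([1.0, 0.2, 0.7])  # hypothetical values of [G | x_i]
dq = descent_direction(cond_G, q, beta=2.0)
q_new = update(q, cond_G, beta=2.0, step=0.1)
print(dq.sum(), q_new, q_new.sum())
```

Because the components of the direction sum to zero, the updated vector still sums to one, which is exactly the role of the constant C in Eq. 6. In this example the move with the lowest conditional expected cost gains probability.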

To follow this gradient, we need an efficient scheme for
estimating the conditional expected G for different $x_i$.
Here we do this via Monte Carlo sampling, i.e., by repeatedly
IID sampling q and recording the resultant private utility
values. After using those samples to form an estimate of
the gradient for each agent, we update the agents’
distributions accordingly. We then start another block of IID
sampling to generate estimates of the next gradients.

The procedure is given in Algorithm 1 (see footnotes 3 and 4).

Algorithm 1 Gradient Descent on the Lagrangian

while system has not converged do
    create L Monte Carlo samples
    {i.e., sample the probability distribution of each agent}
    for each of the L samples do
        compute the world utility G
        compute the reward of each coordinate (Team Game, AU, WLU, ...)
    end for
    for each of the N coordinates do
        compute the component of the gradient
        update the probability distribution
    end for
end while
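Algorithm 1 can be sketched in a few dozen lines. The code below is a simplified illustration rather than the implementation used for the experiments: it assumes a team game, uses a hypothetical `ring_cost` (the number of neighbouring pairs with equal actions) as the world cost, uses a fixed step size instead of the line search of footnote 4, and clips and renormalizes the distributions to keep them valid. All function names and parameter values are placeholders.

```python
import numpy as np

def ring_cost(x):
    """Hypothetical world cost: number of neighbouring pairs on the ring with equal actions."""
    x = np.asarray(x)
    return np.sum(x == np.roll(x, 1, axis=-1), axis=-1)

def estimate_cond_G(samples, costs, n_actions):
    """Monte Carlo estimate of E(G | x_i) for every agent i and action x_i."""
    n_agents = samples.shape[1]
    est = np.full((n_agents, n_actions), float(np.mean(costs)))  # fallback for unsampled actions
    for i in range(n_agents):
        for a in range(n_actions):
            mask = samples[:, i] == a
            if mask.any():
                est[i, a] = costs[mask].mean()
    return est

def gradient_descent(n_agents=6, n_actions=2, beta=2.0, step=0.05,
                     n_samples=200, max_iters=300, tol=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    q = rng.uniform(0.4, 0.6, size=(n_agents, n_actions))
    q /= q.sum(axis=1, keepdims=True)
    for _ in range(max_iters):
        # Each agent samples its move independently from its own q_i.
        u = rng.random((n_samples, n_agents, 1))
        cdf = np.cumsum(q, axis=1)[None, :, :]
        samples = np.minimum((u > cdf).sum(axis=2), n_actions - 1)
        costs = ring_cost(samples)
        cond_G = estimate_cond_G(samples, costs, n_actions)
        # Eq. 5 with the constant C of Eq. 7 (per-agent mean removed).
        raw = cond_G + np.log(q) / beta
        grad = raw - raw.mean(axis=1, keepdims=True)
        q_new = np.clip(q - step * grad, 1e-6, None)
        q_new /= q_new.sum(axis=1, keepdims=True)
        change = np.abs(q_new - q).max()
        q = q_new
        if change < tol:  # stopping rule in the spirit of footnote 3
            break
    return q

q_final = gradient_descent()
print(q_final)
```

Because the gradient is estimated from samples, the change in q fluctuates from block to block, so in this sketch the loop may end at `max_iters` rather than through the threshold test.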


3.1. Semi-coordinate transformations

Let us assume we are the system designer of a MAS. How do
we define the joint strategies of the agents? Let us present a
trivial example. Consider two agents, R and C, which each
have two different actions, denoted 0 and 1. The four
different states are distinct, with four different payoffs. In
this context, what we call semi-coordinates are the different
possible ways we can define the joint strategies, i.e., the
mapping (C, R) → state. The choice of the mapping, as we shall
see, may play an important role.

Formally, this is expressed via the standard rule for
transforming probabilities,

$$P(z) = \int dx\, P(x)\, \delta(z - \zeta(x)),$$

where $\zeta(\cdot)$ is the mapping from x to z. To see what this
rule means geometrically, let $\mathcal{P}$ be the space of all
distributions (product or otherwise) over z’s. Let $\mathcal{Q}$ be
the space of all product distributions over x. Let
$\zeta(\mathcal{Q})$ be its image in $\mathcal{P}$. Then by changing
$\zeta(\cdot)$, we change that image; different choices of
$\zeta(\cdot)$ will result in different manifolds $\zeta(\mathcal{Q})$.

In Figure 1, we present two different semi-coordinate systems.
Here z consists of the possible joint strategies, labelled
(1,1), (1,2), (2,1) and (2,2). Have the space of possible x
equal the space of possible z, and choose
$\zeta(1,1) = (1,1)$, $\zeta(1,2) = (2,2)$, $\zeta(2,1) = (2,1)$,
and $\zeta(2,2) = (1,2)$. Say that q is given by
$q_1(x_1 = 1) = q_2(x_2 = 1) = 2/3$. Then the distribution
over joint strategies z is $P(z = (1,1)) = P(x = (1,1)) = 4/9$,
$P(z = (2,1)) = P(z = (2,2)) = 2/9$, $P(z = (1,2)) = 1/9$.
So $P(z) \neq P(z_1)P(z_2)$; the strategies of the players are
statistically coupled.
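The numbers in this example can be reproduced with exact rational arithmetic. The sketch below applies the mapping ζ and the product distribution q given in the text; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# The mapping zeta from x to z used in the example above.
zeta = {(1, 1): (1, 1), (1, 2): (2, 2), (2, 1): (2, 1), (2, 2): (1, 2)}
q1 = {1: Fraction(2, 3), 2: Fraction(1, 3)}
q2 = {1: Fraction(2, 3), 2: Fraction(1, 3)}

# P(z) = sum over x of q1(x1) * q2(x2) * [z == zeta(x)]
P = {z: Fraction(0) for z in product((1, 2), repeat=2)}
for x1, x2 in product((1, 2), repeat=2):
    P[zeta[(x1, x2)]] += q1[x1] * q2[x2]

# Marginals of z and a check of whether P(z) factorizes.
P_z1 = {a: sum(p for z, p in P.items() if z[0] == a) for a in (1, 2)}
P_z2 = {b: sum(p for z, p in P.items() if z[1] == b) for b in (1, 2)}
is_product = all(P[(a, b)] == P_z1[a] * P_z2[b] for a, b in P)

print(P)
print(P_z1, P_z2, is_product)
```

The marginals come out as P(z_1 = 1) = 5/9 and P(z_2 = 1) = 2/3, whose product 10/27 differs from P(z = (1,1)) = 4/9, confirming that the distribution over z is not a product distribution even though the distribution over x is.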


               R                              R
            1       2                      1       2
   C   1  (1,1)   (1,2)           C   1  (1,1)   (2,2)
       2  (2,1)   (2,2)               2  (2,1)   (1,2)

Figure 1. Two different semi-coordinates in a
2 by 2 game


There are different goals associated with the idea of
semi-coordinate transformation. One is to allow us to find
a good coordinate system to start with or to be re-used in
later searches. The search for a good coordinate system would
then be seen as a preprocessing stage before solving a
problem. Another is that, in the context of a descent, the system
is likely to fall into a local minimum. Changing coordinates
might allow it to escape. Assume the system has reached
a local minimum, and that we perform a coordinate
transformation: in the new coordinate system, the landscape
of the function changes shape, but the system is
still at the same position. We hope that this change will
create a new direction in which to keep descending. By iterating
the process, we believe that we should reach the global
minimum.

3.2. Example 

Let us assume for now that we only have two different
payoffs, a high value (H) and a low value (L). The ultimate goal
of the agents is to maximize the payoff. We present in
Figure 2 two different game matrices corresponding to two
different coordinate systems for the same problem.


        Matrix 1                     Matrix 2

   actions   0    1             actions   0    1
      0      H    L                0      L    L
      1      L    H                1      H    H

Figure 2. Two different coordinate systems
yielding two different games

3 The stopping criterion is for now based only on the change of the
probability distribution: if the change in probability falls under a
fixed threshold, we stop the algorithm.

4 The step size is not held fixed. We perform a line search on the step
size to ensure that $\mathcal{L}$ decreases; if it does not, we reduce
the step size.



Using Matrix 1, a reinforcement learning algorithm will
converge to a fixed strategy to get H. But if we change
coordinates and consider Matrix 2, the problem becomes easier
to solve because less coordination effort is needed. Moreover,
it is possible to improve the entropy term: if the agents
try to maximize the sum of the payoff and the entropy, R
converges to play action 1 and C converges to a mixed
strategy. Increasing the entropy may enable us to be more
flexible.

3.3. Extension from a two players game 

Assume now that we have N coordinates with binary
actions. Recall that we need to form the mapping, i.e., to decide
how the joint actions of the agents are defined. Searching over
all possible transformations is not feasible (there are $(2^N)!$
possibilities). We have not developed any theory to find a good
coordinate system.
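A quick calculation shows how fast this count grows. The helper below is a sketch that assumes a transformation is any relabelling (bijection) of the joint move space onto itself, which is how the count above is read here; the function name is illustrative.

```python
import math

def n_relabelings(n_agents, n_actions=2):
    """Number of bijections of the joint move space: (n_actions ** n_agents)!"""
    return math.factorial(n_actions ** n_agents)

for n in (2, 3, 4, 5):
    print(n, n_relabelings(n))
```

For two binary agents there are 4! = 24 relabellings, which can be enumerated exhaustively; for five binary agents the count is 32!, roughly 2.6 x 10^35. This is why the transformations considered below are restricted to a pair of agents at a time.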

In this paper, we explore some preliminary techniques
for making changes between two coordinates during the
gradient descent proposed in Algorithm 1. We try to improve
matters between two coordinates, hoping that this does not
make matters worse for the other coordinates. This
can be seen as two agents trying to collaborate by exchanging
information in order to improve the system.

Such coupling of the players’ strategies can be viewed
as a manifestation of sets of potential binding contracts. To
illustrate this, return to our two-player example from
Figure 1. Each possible value of a component $x_i$ determines a
pair of possible joint strategies. For example, setting $x_1 = 1$
means the possible joint strategies are (1,1) and (2,2).
Accordingly such a value of $x_i$ can be viewed as a set of
preferred binding contracts. The value of the other components
of x determines which contract is accepted; it is the
intersection of the preferred contracts offered by all the
components of x that determines what single contract is selected.
Continuing with our example, given that $x_1 = 1$, whether
the joint strategy is (1,1) or (2,2) (the two options offered
by $x_1$) is determined by the value of $x_2$.

Binding contracts are a central component of coopera- 
tive game theory. In this sense, semi-coordinate transfor- 
mations can be viewed as a way to convert noncooperative 
game theory into a form of cooperative game theory. 

While the distribution over x uniquely sets the distribu- 
tion over z, the reverse is not true. However so long as our 
Lagrangian directly concerns the distribution over x rather 
than the distribution over z, by minimizing that Lagrangian 
we set a distribution over z. In this way we can minimize 
a Lagrangian involving product distributions, even though 
the associated distribution in the ultimate space of
interest is not a product distribution.

In practice, when we want to change the coordinate
system, we first choose (randomly) two coordinates. Then
we decide on the definition of the joint move space of these
two agents. In other words, if each agent has p actions,
we need to allocate the $p^2$ definitions of the joint actions. For
example, in the case where the actions are binary, there are four
joint actions, labelled a, b, c and d. A shuffle is presented in
Figure 3. Each agent keeps its probability distribution
over its action space, hence the probability that each
joint action occurs changes. In the example, the probability
of the joint action b does not change, whereas the
probability of joint action a, which was $p_R(0) \cdot p_C(0)$, is
now $p_R(1) \cdot p_C(1)$.



[Figure 3 panels: previous coordinate system; new coordinate system.
Rows and columns are labelled by the action probabilities of the two
agents.]

Figure 3. Example of the definition of the joint
action space of the two agents taking part in
the transformation.


We now present the different ways we considered to
choose the transformation.

Local gradient descent: We assume in this part that the size
   of the action space is not too large. For each different
   definition of the joint move, we can perform a ‘local’
   gradient descent in which we update only the probabilities
   of the two agents concerned in the transformation. The
   definition chosen is the one with the best value of
   the world utility. We experiment with both re-using and
   not re-using the new probabilities of the agents. Since
   only two agents are concerned, each descent should be very
   fast, but we need to perform a gradient descent for every
   possible definition, which is feasible only if the action
   space is small.

Based on the value of the expected G: From the Monte
   Carlo simulation, we can compute an estimate of
   the expected world utility for the different joint
   actions. We also have the probability distributions
   of the two agents. With these probability distributions
   held fixed during the transformation, one can
   compute the best allocation of the joint moves to
   optimize the value of the expected world utility G. This
   ensures that the value of the world utility we reach is
   at least as good as before; it can also be done very fast.

   For example, in Figure 4, in the original coordinate
   system, the numbers in the matrix are the expected
   values of the world utility for each joint move. The
   expected world utility is 2.02. If we re-assign the action
   space, the expected world utility can be improved to
   2.26.

Figure 4.
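The allocation based on the expected G can be sketched as a brute-force search over relabellings of the joint moves of the two agents. The probabilities and expected utilities below are hypothetical placeholders (they are not the values behind Figure 4), the function name is illustrative, and the sketch maximizes the expected world utility, following the payoff convention of the example; for a cost the comparison would be reversed.

```python
from itertools import permutations, product

def best_shuffle(p_R, p_C, G_by_label):
    """Assign joint-action labels to cells (x_R, x_C) to maximize expected utility,
    keeping the agents' probability distributions fixed."""
    cells = list(product(range(len(p_R)), range(len(p_C))))
    weights = [p_R[r] * p_C[c] for r, c in cells]
    best_value, best_map = None, None
    for perm in permutations(G_by_label):
        value = sum(w * G_by_label[label] for w, label in zip(weights, perm))
        if best_value is None or value > best_value:
            best_value, best_map = value, dict(zip(cells, perm))
    return best_value, best_map

p_R = [0.7, 0.3]
p_C = [0.6, 0.4]
G_by_label = {"a": 1.0, "b": 3.0, "c": 2.0, "d": 4.0}  # hypothetical estimates of expected G

value, mapping = best_shuffle(p_R, p_C, G_by_label)
print(value, mapping)
```

With these numbers the original labelling (a, b, c, d in row-major order) has expected utility 2.10, and the best relabelling places the highest-utility joint action in the most probable cell, giving 3.00. For binary actions there are only 24 candidates, so the exhaustive search is cheap; since the original labelling is among the candidates, the result can never be worse than the starting point.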


All the curves presented are averaged over several runs. In
Figure 5, we present results where we used a local gradient
descent, with and without re-use of the new probabilities.
The ring is composed of 20 agents, and the temperature
is moderate (which means that the agents are not fully
rational). The improvement in performance compared to a simple
gradient descent is substantial. In Figure 6, we present results
where the agents use either a random shuffle or a shuffle based
on the expected G. For these experiments, the transformation
occurred every 10 runs until iteration 100. In this case
the number of agents is 50. Surprisingly, the random shuffle
has some benefit over not doing any transformation. It
seems that the system has changed sufficiently for it to
reach a better minimum.


We present in Algorithm 2 the algorithm we use in
the experiments. Note that we perform a shuffle each
time the system is close to convergence, i.e., when the system
reaches a local minimum. We also ran experiments in which
we perform the transformation periodically.


Algorithm 2 Gradient Descent with shuffle

while system has not converged do
    create L Monte Carlo samples
    for each of the L samples do
        compute the world utility G
        compute the reward of each coordinate (Team Game, AU, ...)
    end for
    for each of the N coordinates do
        compute the component of the gradient
        update the probability distribution
    end for
    if change in the probability < threshold then
        choose a pair of agents
        choose a shuffle
        perform the coordinate transformation
    end if
end while
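The control flow of Algorithm 2 can be illustrated on the smallest possible case, a single pair of binary agents. The sketch below makes several simplifying assumptions that are not in the paper: it uses exact conditional expectations instead of Monte Carlo estimates, a fixed step size, a hypothetical 2 by 2 cost table, and it represents the coordinate transformation by permuting the entries of that table rather than by storing the mapping itself. It stops when the distributions have stopped changing and no relabelling lowers the expected cost.

```python
import numpy as np
from itertools import permutations

def expected_cost(q1, q2, G):
    return float(q1 @ G @ q2)

def descent_step(q1, q2, G, beta, lr):
    """One gradient step (Eq. 5) for both agents, using exact conditional expectations."""
    updated = []
    for q, cond in ((q1, G @ q2), (q2, G.T @ q1)):
        raw = cond + np.log(q) / beta
        q_new = np.clip(q - lr * (raw - raw.mean()), 1e-9, None)
        updated.append(q_new / q_new.sum())
    return updated[0], updated[1]

def best_shuffle(q1, q2, G):
    """Relabel the joint moves so the expected cost is minimal with q1, q2 held fixed."""
    w = np.outer(q1, q2).ravel()
    flat = G.ravel()
    best = min(permutations(range(flat.size)), key=lambda p: float(w @ flat[list(p)]))
    return flat[list(best)].reshape(G.shape)

def descend_with_shuffle(G, beta=2.0, lr=0.1, tol=1e-6, max_iters=5000):
    G = np.array(G, dtype=float)  # must be 2 x 2 for the initial strategies below
    q1, q2 = np.array([0.6, 0.4]), np.array([0.45, 0.55])
    for _ in range(max_iters):
        n1, n2 = descent_step(q1, q2, G, beta, lr)
        change = max(np.abs(n1 - q1).max(), np.abs(n2 - q2).max())
        q1, q2 = n1, n2
        if change < tol:  # close to a (local) minimum
            G_new = best_shuffle(q1, q2, G)
            if expected_cost(q1, q2, G_new) >= expected_cost(q1, q2, G) - 1e-12:
                break  # no relabelling helps: stop
            G = G_new  # perform the coordinate transformation
    return q1, q2, G

q1, q2, G_final = descend_with_shuffle([[0.0, 1.0], [1.0, 0.0]])
print(q1, q2)
print(G_final)
```

In a larger system the same shuffle step would be applied to a chosen pair of agents while the descent continues for all agents, as in Algorithm 2.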


4. Experimental Results 

We ran experiments on a simple coordination game in which
the agents are arranged in a ring and each must pick an
action opposite to that of its neighbors. The agents are
playing a team game. They do not know the world utility
function and they do not observe the actions of the other
players. The agents that are allowed to change their
coordinates are necessarily neighbors.


[Plot: Descent on the Lagrangian; system with 16 agents; beta= 02]

Figure 5.


[Plot: Comparison of shuffle strategy in Gradient Descent;
Temperature = 0.5; number of agents = 10]

Figure 6.





5. Future work

We are currently investigating other criteria for deciding on
the shuffle to perform. In particular, we are trying to
understand whether the gradient information can be used.
Intuitively, if the system gets stuck in a local minimum, we
need to look for a transformation that provides a new
possibility to go downhill. Hence, we are investigating
transformations that yield some improvement in the expected
Lagrangian and have a potentially large gradient.

Also, we are investigating ways to apply transformations
to a larger set of agents. In the current implementation, we
make only one coordinate transformation between two
agents. In large systems, it might be difficult to see the
improvement made by such a local change. We are investigating
ways to make multiple local changes in one iteration.
Another question is when a shuffle needs to be
performed.

6. Conclusion

Product Distribution (PD) theory is a recently introduced 
broad framework for analyzing, controlling, and optimizing 
distributed systems [8, 9, 10]. Here we investigate PD the- 
ory’s use for adaptive, distributed control of a MAS. Typi- 
cally such control is done by having each agent run its own 
reinforcement learning algorithm [3, 12, 13, 11]. 

In this approach the utility function of each agent is 
based on the world utility G(x) mapping the joint move of 
the agents, $x \in X$, to the performance of the overall system.
However in practice the agents in a MAS are bounded ratio- 
nal. Moreover the equilibrium they reach will typically in- 
volve mixed strategies rather than pure strategies, i.e., they 
don’t settle on a single point x optimizing G(x). This sug- 
gests formulating an approach that explicitly accounts for 
the bounded rational, mixed strategy character of the agents. 

PD theory directly addresses these issues by casting the 
control problem as one of minimizing a Lagrangian of the 
joint probability distribution of the agents. This allows the 
equilibrium to be found using gradient descent techniques. 
In PD theory, such gradient descent can be done in a dis- 
tributed manner. 

We present experiments in which we perform semi-coordinate
transformations, that is, changing the definition
of the joint strategies of the agents during the gradient
descent. The experimental results show that these
transformations help to improve the speed of convergence
and improve the quality of the equilibrium found by
escaping local minima. It is interesting to notice that, by making
several local changes in the system, we can affect the
performance of the overall system. These preliminary results
are encouraging.